NON-TECHNICAL  SUMMARY 




We suppose that a complex machine, consisting of $m$ parts, has broken down. Checking the $i$-th part incurs a cost $c_i$ and spots the defect with probability $\alpha_i$ if the fault is there. At discrete time points we must decide either to check one of the parts or else to junk the machine. Junking the machine might be done, for instance, if we felt that the fault was in a part which would be too expensive to detect. A penalty cost $R$ is incurred if the machine is junked.

The problem is posed as a sequential search and stop model which is shown to include the above as a special case. A prior probability vector $P = (p_1, \dots, p_m)$ is given, i.e. $p_i = P\{\text{fault in part } i\}$, and a major result is that in the above problem an optimal policy either searches a part with the maximal present probability per cost of finding the fault there, or else it junks the machine.
A PROBLEM IN OPTIMAL SEARCH AND STOP
Sheldon  M.  Ross 


1. Introduction and Summary

The following model has been considered in the literature: We are told that an object is hidden in one of $m$ boxes and we are given prior probabilities $p_i^0$, $i = 1, 2, \dots, m$ ($\sum_i p_i^0 = 1$) that the object is in the $i$-th box. A search of box $i$ costs $c_i$ ($c_i > 0$), and finds the object with probability $\alpha_i$ if the object is in the box (i.e. $1 - \alpha_i$ is the overlook probability for the $i$-th box). At the beginning of each time period $t = 1, 2, \dots$ a box is searched, and the process ends when the object is found.


Blackwell (see [2]) has shown that the strategy which at time $t$ searches a box with the largest present value of $\alpha_i p_i / c_i$ minimizes the expected searching cost (where $p_i$ is the posterior probability at time $t$ that the object is in box $i$). Chew [3] and Kadane [4] have shown that if $c_i \equiv 1$ then this strategy also maximizes the probability that the searching cost will be less than $A$ for every $A > 0$.




In this paper, in order to motivate the search, we suppose that a reward $R_i$, $i = 1, \dots, m$, is earned if the object is found in the $i$-th box. We also suppose that the searcher may decide to stop searching at any time (for example he may feel that the rewards are not large enough to justify the searching costs). If the searcher decides to stop before finding the object then from that point on he incurs no further costs and of course receives no reward.

In the second section of this paper we show that an optimal strategy exists and is defined by a functional equation. The optimal strategy is exhibited in a special case. The third section deals with the optimal $n$-stage return function. The fourth section presents some counterexamples, and in the fifth section we present the major results. Speaking loosely, we show that the optimal strategy either searches the box with maximal value of $\alpha_i p_i / c_i$ or else it never searches that box. Also, if rewards are equal, $R_i \equiv R$, then the optimal strategy either searches the box with maximal $\alpha_i p_i / c_i$ or else it stops. In the final section we assume that $R_i \equiv R$ and present a sequence of strategies converging to the optimal.
2. Optimal Strategy

A strategy is any sequence (or partial sequence) $\delta = (\delta_1, \dots, \delta_s)$ where $\delta_i \in \{1, 2, \dots, m\}$ for $i = 1, \dots, s$ and $s \in \{0, 1, 2, \dots, \infty\}$. The policy $\delta$ instructs the searcher to search box $\delta_i$ at the $i$-th period and to stop searching if the object hasn't been found after the $s$-th search ($s = 0$ means that the searcher stops immediately and $s = \infty$ means that he doesn't stop until he finds the object).

For any strategy $\delta$ and any $P = (p_1, \dots, p_m)$, $p_i \ge 0$, $\sum_i p_i = 1$, let $f(P, \delta)$ be the risk (expected searching cost minus expected reward) incurred when $P$ is the vector of prior probabilities and strategy $\delta$ is employed. Also let $f(P) = \inf_\delta f(P, \delta)$. Then it follows from standard arguments (see for instance [1], p. 83) that

(1)  $f(P) = \min\Bigl\{0,\ \min_{i = 1, \dots, m} \bigl[c_i - \alpha_i p_i R_i + (1 - \alpha_i p_i) f(T_i P)\bigr]\Bigr\}$

where $T_i P = ((T_i P)_1, \dots, (T_i P)_m)$, $i = 1, 2, \dots, m$, and where

(2)  $(T_i P)_j = \begin{cases} p_j (1 - \alpha_i p_i)^{-1}, & j \ne i \\ (1 - \alpha_i) p_i (1 - \alpha_i p_i)^{-1}, & j = i \end{cases}$

Thus $(T_i P)_j$ is just the posterior probability that the object is in box $j$ given that a search of $i$ has not uncovered it. We shall say that the process is in state $P$ at time $t$ if $P$ denotes the posterior probability vector at time $t$.
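The update (2) is a one-line application of Bayes' rule, and a small sketch may make it concrete. The function name `posterior_after_miss` and the 0-based box indexing are illustrative assumptions, not notation from the paper.

```python
from typing import List


def posterior_after_miss(p: List[float], alpha: List[float], i: int) -> List[float]:
    """Return T_i P, the posterior after an unsuccessful search of box i (0-based)."""
    miss = 1.0 - alpha[i] * p[i]  # probability that a search of box i fails
    if miss <= 0.0:
        raise ValueError("a search of box i cannot fail, so T_i P is undefined")
    posterior = [p_j / miss for p_j in p]          # case j != i of (2)
    posterior[i] = (1.0 - alpha[i]) * p[i] / miss  # case j == i of (2)
    return posterior


if __name__ == "__main__":
    # Two boxes; box 2 (index 1) is searched and the object is not found.
    print(posterior_after_miss([0.4, 0.6], [1.0, 0.65], 1))
```

Applying the function repeatedly with different indices gives the state of the process after any sequence of unsuccessful searches.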
In order to show the existence of an optimal strategy let $R = \max_i R_i$ and consider a related process (the prime process) with $c_i' = c_i$, $\alpha_i' = \alpha_i$, but with $R_i' = R_i - R$. However, for this new process we suppose that a penalty cost of $R$ units is imposed if the searcher decides to stop searching before finding the object. Now it is easy to see that for any strategy $\delta$ which terminates (either by finding the object or by stopping) in finite expected time we have $f(P, \delta) = f'(P, \delta) - R$, and since these are the only strategies we need consider (any strategy which doesn't terminate in finite expected time has $f(P, \delta) = f'(P, \delta) = \infty$) it follows that any strategy optimal for the prime process is optimal for the original one.* However, the prime process is a dynamic programming process with a finite number of possible actions available at each stage and with non-positive returns at each stage (since $R_i' \le 0$ for all $i$). It then follows from Strauch [1] that an optimal strategy exists and also that the optimal strategies may be characterized as those strategies which, when the process is in state $P$, choose one of the actions which minimize the right side of (1), i.e. for such a $\delta^*$, $f(P, \delta^*) = f(P)$ for all $P$.

*The above argument also shows that there is no additional generality gained in assuming that a penalty cost $C$ is incurred when the searcher stops without finding the object, as this process would just be equivalent to the original one with rewards $R_i + C$ instead of $R_i$.

The importance of rigorously proving that an optimal policy exists and is determined by a functional equation cannot be overemphasized. For example, in the above suppose we relax the condition that $c_i > 0$ and let $c_1 = 0$. Then if $\alpha_1 p_1 > 0$ it is clear that for any strategy $\delta = (\delta_1, \dots, \delta_s) \ne (1, 1, 1, \dots)$ we have $f(P, (1, \delta_1, \dots, \delta_s)) < f(P, (\delta_1, \dots, \delta_s))$ (since a search of 1 is free), and thus the only possible optimal strategy would be $\delta^1 = (1, 1, 1, \dots)$. However, $f(P, \delta^1)$ need not be minimal. For example, if [...] and $c_2 = 1$, $\alpha_2 = 1$, $p_2 = 5/10$, [...], then [...]. Also, the strategy determined by the functional equation turns out to be the (non-optimal) strategy $\delta^1$. The reason that the existence proof breaks down is that, since $c_1 = 0$, it no longer follows that all strategies with infinite expected termination time have $f(P, \delta) = \infty$.

Now consider the class $A$ of strategies $\delta = (\delta_1, \delta_2, \dots)$ that never stop searching until the object is found ($s = \infty$). Any policy $\delta \in A$ which finds the object with probability 1 has $f(P, \delta) = E_\delta[L] - \sum_i p_i R_i$, where $L$ is the searching cost incurred; any $\delta \in A$ which has positive probability of never finding the object has $f(P, \delta) = \infty$. Thus among the class of policies which never stop searching until the object is found, the one with minimal expected searching cost is best. Thus by Blackwell's result the strategy $\delta^0$, which when in state $P$ searches the box (or one of the boxes) with the maximal value of $\alpha_i p_i / c_i$, is optimal among the policies in $A$.

Lemma 2.1: If $\alpha_i p_i R_i > c_i$ for some $i$ then no optimal strategy stops searching at $P = (p_1, \dots, p_m)$. If $\alpha_i p_i R_i \ge c_i$ for some $i$ then there is an optimal strategy which doesn't stop at $P$.
Proof: From (1) we have that

$f(P) \le c_i - \alpha_i p_i R_i + (1 - \alpha_i p_i) f(T_i P)$

$\quad < 0 + (1 - \alpha_i p_i) f(T_i P)$

$\quad \le 0$

and so $f(P) < 0$ and thus no optimal policy stops at $P$. If $\alpha_i p_i R_i \ge c_i$ then $f_i(P) \equiv c_i - \alpha_i p_i R_i + (1 - \alpha_i p_i) f(T_i P) \le 0$. Now if $f(P) = 0$ then $f(P) = f_i(P)$ and so searching $i$ is optimal; if $f(P) < 0$ then stopping is not optimal. Q.E.D.

Theorem 2.2: If $\sum_i c_i / (\alpha_i R_i) < 1$ then $\delta^0$ is optimal, i.e. $f(P, \delta^0) = f(P)$ for all $P$.

Proof: For any $P$, if $\max_i (\alpha_i p_i R_i - c_i) \ge 0$ then there exists an optimal strategy which doesn't stop at $P$. So a necessary condition for every optimal strategy to stop at $P$ is for

$\alpha_i p_i R_i < c_i$ for all $i$

$\Rightarrow\ p_i < c_i / (\alpha_i R_i)$ for all $i$

$\Rightarrow\ 1 < \sum_i c_i / (\alpha_i R_i)$

So if $\sum_i c_i / (\alpha_i R_i) < 1$ then for every $P$ there is an optimal strategy which doesn't stop at $P$. Thus an optimal strategy exists in $A$, which implies that $\delta^0$ is optimal.

Q.E.D.
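As a rough illustration of Theorem 2.2, the sketch below checks the sufficient condition $\sum_i c_i / (\alpha_i R_i) < 1$ and picks the box that the index rule would search next. The helper names `never_stop_is_optimal` and `index_box` are hypothetical, boxes are 0-based, and the code assumes $\alpha_i R_i > 0$ for every box.

```python
from typing import Sequence


def never_stop_is_optimal(c: Sequence[float], alpha: Sequence[float],
                          R: Sequence[float]) -> bool:
    """Sufficient condition of Theorem 2.2 (assumes alpha_i * R_i > 0 for all i)."""
    return sum(c_i / (a_i * r_i) for c_i, a_i, r_i in zip(c, alpha, R)) < 1.0


def index_box(p: Sequence[float], c: Sequence[float], alpha: Sequence[float]) -> int:
    """0-based index of a box with maximal alpha_i * p_i / c_i (ties: lowest index)."""
    return max(range(len(p)), key=lambda i: alpha[i] * p[i] / c[i])


if __name__ == "__main__":
    c, alpha, R = [1.0, 2.0], [0.5, 1.0], [10.0, 10.0]
    print(never_stop_is_optimal(c, alpha, R))  # condition of Theorem 2.2
    print(index_box([0.5, 0.5], c, alpha))     # box searched first by the index rule
```

When the condition fails, the theorem says nothing either way, so a `False` result should not be read as "stopping is optimal".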


3. The Optimal Return $f(P)$

Theorem 3.1: $f(P)$ is a concave function of $P$.
Proof: Let $f_i(\delta)$ be the conditional risk given that the object is in $i$ and strategy $\delta$ is employed, $i = 1, \dots, m$. Then $f(P, \delta) = \sum_i p_i f_i(\delta)$. Now let $P = \lambda P^1 + (1 - \lambda) P^2$; then

$f(P) = \inf_\delta f(P, \delta)$

$\quad = \inf_\delta f(\lambda P^1 + (1 - \lambda) P^2, \delta)$

$\quad = \inf_\delta \sum_i (\lambda p_i^1 + (1 - \lambda) p_i^2) f_i(\delta)$

$\quad \ge \lambda \inf_\delta \sum_i p_i^1 f_i(\delta) + (1 - \lambda) \inf_\delta \sum_i p_i^2 f_i(\delta)$

$\quad = \lambda f(P^1) + (1 - \lambda) f(P^2)$

Q.E.D.

Corollary 3.2: The optimal stop region $S \equiv \{P : f(P) = 0\}$ is convex.

Proof: Suppose $P = \lambda P^1 + (1 - \lambda) P^2$ and $f(P^1) = f(P^2) = 0$. Then $f(P) \le 0$ by (1) and $f(P) \ge 0$ by the above.

Q.E.D.


Let

(3)  $f_1(P) = \min\Bigl\{0,\ \min_i \bigl[c_i - \alpha_i p_i R_i\bigr]\Bigr\}$

     $f_n(P) = \min\Bigl\{0,\ \min_i \bigl[c_i - \alpha_i p_i R_i + (1 - \alpha_i p_i) f_{n-1}(T_i P)\bigr]\Bigr\}$

Thus $f_n(P)$ is just the minimal risk incurred if the searcher is allowed at most $n$ searches. Clearly $f_n(P) \ge f_{n+1}(P) \ge f(P)$ for all $n$, all $P$, and it seems reasonable that $f_n(P) \downarrow f(P)$ as $n \uparrow \infty$. This is shown in the following.
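The recursion (3) can be evaluated directly for small $m$ and $n$. The following is a minimal sketch, not an efficient implementation (its running time grows like $m^n$); the function name `n_stage_risk`, the 0-based indexing, and the convention $f_0 \equiv 0$ used as the base case are assumptions made here, as are the numbers in the usage lines.

```python
from typing import Sequence


def n_stage_risk(n: int, p: Sequence[float], c: Sequence[float],
                 alpha: Sequence[float], R: Sequence[float]) -> float:
    """f_n(P) from (3): minimal risk when at most n searches are allowed."""
    if n == 0:
        return 0.0  # assumed base case: no searches left, so the searcher stops
    best = 0.0      # the option of stopping immediately
    for i in range(len(p)):
        hit = alpha[i] * p[i]      # chance that a search of box i succeeds
        risk = c[i] - hit * R[i]   # one-step cost minus expected reward
        miss = 1.0 - hit
        if miss > 0.0:
            post = [p_j / miss for p_j in p]           # T_i P, case j != i
            post[i] = (1.0 - alpha[i]) * p[i] / miss   # T_i P, case j == i
            risk += miss * n_stage_risk(n - 1, post, c, alpha, R)
        best = min(best, risk)
    return best


if __name__ == "__main__":
    # Illustrative two-box instance.
    c, alpha, R = [50.0, 50.0], [1.0, 0.65], [100.0, 100.0]
    for n in (1, 2, 3):
        print(n, n_stage_risk(n, [0.4, 0.6], c, alpha, R))
```

Because each extra stage can only add options, the printed values are non-increasing in $n$, in line with $f_n(P) \ge f_{n+1}(P)$.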

Letting $c = \min_i c_i$, $D = \max_i (R_i - c_i)$,

Theorem 3.3: $f_n(P) - f(P) \le \dfrac{D^2}{nc}$ for all $n$, all $P$.

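Proof: Let $\delta^*$ be an optimal strategy, let $T$ be the random number of times $\delta^*$ searches before terminating, and let $\delta_n^*$ be $\delta^*$ terminated at $n$, i.e. $\delta_n^* = (\delta_1^*, \dots, \delta_{\min(s, n)}^*)$. Then

(4)  $f(P) = f(P, \delta^*) = E_{\delta^*}[X \mid T \le n]\, P_{\delta^*}[T \le n] + E_{\delta^*}[X \mid T > n]\, P_{\delta^*}[T > n]$

and

(5)  $f_n(P) \le f(P, \delta_n^*) = E_{\delta_n^*}[X \mid T \le n]\, P_{\delta^*}[T \le n] + E_{\delta_n^*}[X \mid T > n]\, P_{\delta^*}[T > n]$

where $X$ denotes the total cost incurred (and everything is understood to be conditional on the prior probability vector $P$). Since $\delta^*$ and $\delta_n^*$ agree on the event $\{T \le n\}$, subtracting (4) from (5) gives

(6)  $f_n(P) - f(P) \le \bigl[E_{\delta_n^*}[X \mid T > n] - E_{\delta^*}[X \mid T > n]\bigr]\, P_{\delta^*}[T > n] \le D\, P_{\delta^*}[T > n]$

To get a bound on $P_{\delta^*}[T > n]$ we use (4) to get

(7)  $0 \ge f(P) \ge -D\, P_{\delta^*}[T \le n] + (-D + nc)\, P_{\delta^*}[T > n] = -D + nc\, P_{\delta^*}[T > n]$

or

(8)  $P_{\delta^*}[T > n] \le D / nc$

The result follows from (6) and (8).

Q.E.D.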
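The bound of Theorem 3.3 also tells us how many stages suffice for a given accuracy: $D^2/(nc) \le \varepsilon$ as soon as $n \ge D^2/(c\varepsilon)$. The sketch below computes both quantities; the names `truncation_bound` and `stages_for_tolerance` and the tolerance parameter `eps` are illustrative assumptions, not part of the paper.

```python
import math
from typing import Sequence


def truncation_bound(c: Sequence[float], R: Sequence[float], n: int) -> float:
    """Upper bound D^2 / (n c) on f_n(P) - f(P), with c = min c_i, D = max (R_i - c_i)."""
    d = max(r_i - c_i for r_i, c_i in zip(R, c))
    return d * d / (n * min(c))


def stages_for_tolerance(c: Sequence[float], R: Sequence[float], eps: float) -> int:
    """Smallest n >= 1 for which the bound D^2 / (n c) is at most eps."""
    d = max(r_i - c_i for r_i, c_i in zip(R, c))
    return max(1, math.ceil(d * d / (min(c) * eps)))


if __name__ == "__main__":
    c, R = [50.0, 50.0], [100.0, 100.0]
    print(truncation_bound(c, R, 10))
    print(stages_for_tolerance(c, R, 1.0))
```

The bound is a worst case over all $P$, so in a particular state far fewer stages may already give $f_n(P) = f(P)$.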
Corollary 3.4: If $\alpha_i R_i \le c_i$ for all $i = 1, 2, \dots, m$ then $f(P) \equiv 0$, i.e. the policy which never searches is optimal.

Proof: It follows from (3) that $f_1(P) \equiv 0$, and by induction that $f_n(P) \equiv 0$ for all $n$, and thus by the above $f(P) \equiv 0$. Q.E.D.

The above Corollary may also be proven directly by letting $e^i$ be the $m$-vector of all zeroes except for a one in the $i$-th spot. If $\alpha_i R_i \le c_i$ for all $i$ then by (1) it follows that $f(e^i) = 0$, $i = 1, \dots, m$; and thus by concavity $f(P) \equiv 0$.
4. Counter-Examples

Consider  the  following  three  conjectures: 

1. If $c_i > R_i$ then an optimal strategy will never search box $i$.

2. If an optimal strategy doesn't stop at $P$ then it searches a box with maximal $\alpha_i p_i / c_i$.

3. If $m$ is the number of boxes then an $m$-stage look ahead strategy is optimal; where an $m$-stage look ahead strategy is defined as any strategy which stops at $P$ if $f_m(P) = 0$, and searches the box $i$ at $P$ if $f_m(P) = c_i - \alpha_i p_i R_i + (1 - \alpha_i p_i) f_{m-1}(T_i P)$.


We shall now give examples showing that each of these conjectures need not hold.

Example 1:

$\alpha_1 = 1$, $\alpha_2 = 1$
$p_1 = 3/4$, $p_2 = 1/4$
$c_1 = 5$, $c_2 = 10$
$R_1 = 0$, $R_2 = 210$

If the searcher first searches 2 and then acts optimally his risk is $10 - \tfrac{1}{4} \cdot 210 = -170/4$; while if he first searches 1 and then acts optimally his risk is $5 - \tfrac{1}{4} \cdot 200 = -45 < -170/4$. Thus the optimal strategy starts by searching 1.


Example 2:

$\alpha_1 = 1$, $\alpha_2 = 1$
$p_1 = 3/4$, $p_2 = 1/4$
$c_1 = 10$, $c_2 = 10$
$R_1 = 0$, $R_2 = 210$

If the searcher first searches 1 then his minimal risk is $10 - \tfrac{1}{4} \cdot 200 = -40$; while if he first searches 2 his minimal risk is $10 - \tfrac{1}{4} \cdot 210 < -40$. Thus the optimal strategy starts by searching 2. However $\alpha_1 p_1 / c_1 > \alpha_2 p_2 / c_2$.


Example 3:

$\alpha_1 = 1$, $\alpha_2 = .65$
$p_1 = .4$, $p_2 = .6$
$c_1 = 50$, $c_2 = 50$
$R_1 = 100$, $R_2 = 100$

It can be checked directly that $f_2(.4, .6) = 0$ and so the two-stage look ahead strategy stops. However

$f_3(.4, .6) = .4(-50) + .6[100 - (.65)100 + .35(50 - 100(.65))] < 0$

and so the two-stage look ahead strategy is not optimal.


Thus none of the conjectures need be true. We will later show, however, that in a special case ($R_i \equiv R$) conjectures 1 and 2 are in fact true.
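The arithmetic in the three examples can be re-derived by evaluating the risk of a fixed search sequence. The sketch below does this for an open-loop sequence that stops once the sequence is exhausted; the name `strategy_risk` and the 0-based box indices are assumptions made for illustration. For these particular numbers, "acting optimally" after the first search reduces to the short sequences shown in the usage lines (for instance, in Example 1, after an unsuccessful search of box 2 only box 1 with zero reward remains, so the searcher stops).

```python
from typing import Sequence


def strategy_risk(p: Sequence[float], c: Sequence[float], alpha: Sequence[float],
                  R: Sequence[float], seq: Sequence[int]) -> float:
    """Expected search cost minus expected reward for searching `seq` in order.

    The searcher stops when the object is found or when `seq` is exhausted.
    """
    w = list(p)  # w[j] = P(object is in box j and has not been found so far)
    risk = 0.0
    for i in seq:
        still_searching = sum(w)                 # P(no earlier search succeeded)
        risk += still_searching * c[i]           # the cost is paid only if we get here
        risk -= alpha[i] * w[i] * R[i]           # reward if this search succeeds
        w[i] *= 1.0 - alpha[i]                   # object in i but overlooked
    return risk


if __name__ == "__main__":
    ex1 = dict(p=[0.75, 0.25], c=[5.0, 10.0], alpha=[1.0, 1.0], R=[0.0, 210.0])
    print(strategy_risk(seq=[1], **ex1), strategy_risk(seq=[0, 1], **ex1))
    ex3 = dict(p=[0.4, 0.6], c=[50.0, 50.0], alpha=[1.0, 0.65], R=[100.0, 100.0])
    print(strategy_risk(seq=[0, 1, 1], **ex3))
```

Working with the unnormalized weights `w` avoids recomputing posteriors at every step; dividing `w` by its sum would recover the posterior vector if it were needed.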




5.  Main  Theorems 

For any strategy $\delta$ let $(i, j, \delta)$ be the strategy which first searches $i$, then $j$, and then follows strategy $\delta$.

We shall need the following


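Lemma 5.1: For any strategy $\delta$ such that $f(P, \delta) < \infty$,

$f(P, (i, j, \delta)) \ge f(P, (j, i, \delta))$ iff $\alpha_i p_i / c_i \le \alpha_j p_j / c_j$,

with equality in one relation iff there is equality in the other.

Proof:

$f(P, (i, j, \delta)) = c_i - \alpha_i p_i R_i + (1 - \alpha_i p_i)\Bigl[c_j - \dfrac{\alpha_j p_j}{1 - \alpha_i p_i} R_j + \Bigl(1 - \dfrac{\alpha_j p_j}{1 - \alpha_i p_i}\Bigr) f(T_j T_i P, \delta)\Bigr]$

$f(P, (j, i, \delta)) = c_j - \alpha_j p_j R_j + (1 - \alpha_j p_j)\Bigl[c_i - \dfrac{\alpha_i p_i}{1 - \alpha_j p_j} R_i + \Bigl(1 - \dfrac{\alpha_i p_i}{1 - \alpha_j p_j}\Bigr) f(T_i T_j P, \delta)\Bigr]$

Now since $T_i T_j P = T_j T_i P$ it follows that

$f(P, (i, j, \delta)) - f(P, (j, i, \delta)) = \alpha_j p_j c_i - \alpha_i p_i c_j$

Q.E.D.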
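The interchange identity can be checked numerically. Since the continuation term $(1 - \alpha_i p_i - \alpha_j p_j) f(T_j T_i P, \delta)$ is common to both orders, it suffices to compare the parts of the two risks contributed by the first two searches. The helper names `two_search_head` and `interchange_difference` below are hypothetical, boxes are 0-based, and the formulas assume $i \ne j$.

```python
from typing import Sequence


def two_search_head(p: Sequence[float], c: Sequence[float], alpha: Sequence[float],
                    R: Sequence[float], i: int, j: int) -> float:
    """Cost minus reward from searching i and then j (i != j), without the continuation."""
    first = c[i] - alpha[i] * p[i] * R[i]
    # The second search is reached with probability 1 - alpha_i p_i; it succeeds
    # with (unconditional) probability alpha_j p_j because i != j.
    second = (1.0 - alpha[i] * p[i]) * c[j] - alpha[j] * p[j] * R[j]
    return first + second


def interchange_difference(p: Sequence[float], c: Sequence[float],
                           alpha: Sequence[float], i: int, j: int) -> float:
    """Right-hand side of Lemma 5.1: alpha_j p_j c_i - alpha_i p_i c_j."""
    return alpha[j] * p[j] * c[i] - alpha[i] * p[i] * c[j]


if __name__ == "__main__":
    p, c, alpha, R = [0.75, 0.25], [10.0, 10.0], [1.0, 1.0], [0.0, 210.0]
    lhs = two_search_head(p, c, alpha, R, 0, 1) - two_search_head(p, c, alpha, R, 1, 0)
    print(lhs, interchange_difference(p, c, alpha, 0, 1))
```

Note that the rewards cancel in the difference, which is why the comparison of the two orders depends only on $\alpha_i p_i / c_i$.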


Notation: For any policy $\delta = (\delta_1, \dots, \delta_s)$ and $t \le s$, let

$P^0_{\delta, t} = T_{\delta_t} T_{\delta_{t-1}} \cdots T_{\delta_1} P^0$

Thus $P^0_{\delta, t}$ is just the posterior probability vector given that $\delta$ is employed and the item has not been found after $t$ searches.
Theorem 5.2: If $\alpha_i p_i^0 / c_i = \max_j \alpha_j p_j^0 / c_j$ then

(a) if $\alpha_i p_i^0 R_i \ge c_i$ then there is an optimal strategy $\delta^*$ having $\delta_1^* = i$;

(b) if there does not exist an optimal strategy with $\delta_1^* = i$ then no optimal strategy ever searches $i$.


Proof: (a) We first show that there is an optimal strategy $\delta^*$ having $\delta_k^* = i$ for some $k \le s$. For suppose that no optimal strategy ever searched $i$; then for any optimal strategy $\delta^*$, $(P^0_{\delta^*, t})_i \ge p_i^0$ for all $t$ and so by Lemma 2.1 the optimal strategy need not stop. But then $\delta^0$ is optimal and so there would be an optimal strategy with $\delta_1^* = i$. Thus there is an optimal strategy $\delta^*$ which searches $i$. Let $k$ be the first time $\delta^*$ searches $i$. If $k \ne 1$ then, since $i$ has not been searched before time $k$, it follows that

$\alpha_i (P^0_{\delta^*, k-2})_i / c_i \ge \alpha_j (P^0_{\delta^*, k-2})_j / c_j$ where $j = \delta_{k-1}^*$,

and so by Lemma 5.1 there is an optimal strategy with $\delta_{k-1}^* = i$. By induction we see that there is an optimal strategy with $\delta_1^* = i$.

(b) We have shown by the above that if an optimal strategy $\delta^*$ has $\delta_k^* = i$ for some $k$ then there is an optimal strategy with $\delta_1^* = i$.

Q.E.D.


Corollary 5.3: If $\alpha_i p_i^0 / c_i > \alpha_j p_j^0 / c_j$ for all $j \ne i$ then either

(a) every optimal strategy has $\delta_1^* = i$

or

(b) no optimal strategy ever searches $i$.

Proof: Follows in the same manner as in the previous Theorem.


Note that if the state of the process at time $t$ is $P$ then from that point on we can consider the process as starting anew with prior probability vector $P$. Thus at time $t$ it is optimal to search the box with the largest present value of $\alpha p / c$ or else that box is never searched from that point on. We are able to prove a stronger result in the special case where all rewards are equal.


Theorem 5.4: Suppose $R_i = R$ for all $i$. If $\alpha_i p_i^0 / c_i = \max_j \alpha_j p_j^0 / c_j$ then either

(a) there is an optimal strategy with $\delta_1^* = i$

or

(b) the only optimal strategy is the one which does not search, i.e. $s = 0$.


Proof: Let $\delta^* = (\delta_1^*, \dots, \delta_s^*)$ be an optimal strategy. If $\delta^*$ ever searches $i$ then we can show by successive permutations (as in Theorem 5.2) that there is an optimal strategy with $\delta_1^* = i$. If $\delta^*$ never searches $i$ then $s < \infty$, for if $\delta^*$ didn't stop and never searched $i$ then it would have infinite risk and so wouldn't be optimal. Suppose now that $s \ne 0$ and let $k = \delta_s^*$. Since $k$ will be the last search made it follows that $\alpha_k (P^0_{\delta^*, s-1})_k R \ge c_k$ (or else it would be better not to make the last search). But since $\delta^*$ never searches $i$ it follows that

$\alpha_i (P^0_{\delta^*, s-1})_i / c_i \ge \alpha_k (P^0_{\delta^*, s-1})_k / c_k$

and thus

$\alpha_i (P^0_{\delta^*, s-1})_i / c_i \ge 1/R$

But then by Lemma 2.1 it would be optimal to search $i$ at time $s + 1$, and so by the above there would be an optimal strategy with $\delta_1^* = i$.

Q.E.D.


In a similar manner we may prove the following

Corollary 5.5: If $R_i \equiv R$ and if $\alpha_i p_i^0 / c_i \ne \max_j \alpha_j p_j^0 / c_j$, then any strategy $\delta$ with $\delta_1 = i$ is not optimal.

Proof: Let $k$ be such that $\alpha_k p_k^0 / c_k = \max_j \alpha_j p_j^0 / c_j$. If $\delta$ searches $k$ at some time then by successively permuting and using Lemma 5.1 it follows that we may (strictly) improve upon $\delta$. If $\delta$ never searches $k$ then by the same reasoning as used in the above Theorem it follows that $\delta$ can't be optimal.

Q.E.D.


Thus when all rewards are equal it is either optimal to search a box with the maximal value of $\alpha_i p_i / c_i$ or else it is optimal to stop.

In [3] Chew considered the problem where there is no reward given for finding the object but where there is a penalty cost $C$ incurred if the searcher stops without finding the object. He also supposed that $\alpha_1 = 0$ and $p_1^0 > 0$.* (Thus there is positive probability that the object is in the first box but with probability one a search would overlook it.) He showed that if $c_i \equiv 1$ then the optimal strategy either searches the box with maximal $\alpha_i p_i / c_i$ or else stops. However, as was previously pointed out, this problem is equivalent to the one we've considered with $R_i \equiv C$. Thus Theorem 5.4 may be considered as an extension of Chew's result to non-constant costs and to general overlook probabilities.

*Actually Chew supposed that $\sum_i p_i^0 < 1$. However this is clearly equivalent to having $\sum_i p_i^0 = 1$ and having a box with an overlook probability of one.
6. Approximations to Optimal Strategy

In this section we suppose that $R_i \equiv R$, and exhibit a sequence of strategies which converge to an optimal strategy.


Let $\delta^* = (\delta_1^*, \dots, \delta_s^*)$ be an optimal strategy which, when in state $P$, either stops if $f(P) = 0$ or else searches a box with maximal value of $\alpha_i p_i / c_i$. Let $T$ be the random number of stages $\delta^*$ searches before terminating, and recall that $c = \min_i c_i$. We shall need the following:


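Lemma 6.1: $P_{\delta^*}(T > n) \le \Bigl(1 - \dfrac{c}{\sum_i c_i / \alpha_i}\Bigr)^n$ for all $n$.

Proof: The minimal value of $\max_i \alpha_i p_i / c_i$ is achieved by that vector $P$ having

(9)  $\alpha_1 p_1 / c_1 = \cdots = \alpha_m p_m / c_m$

and thus

(10)  $\min_P \max_i \alpha_i p_i / c_i = \dfrac{1}{\sum_i c_i / \alpha_i}$

Now each time $\delta^*$ searches, it searches a box with maximal value of $\alpha_i p_i / c_i$. Thus each time $\delta^*$ searches a box (say box $j$) the probability $\alpha_j p_j$ that the item will be found is such that

(11)  $\alpha_j p_j \ge \dfrac{c_j}{\sum_i c_i / \alpha_i} \ge \dfrac{c}{\sum_i c_i / \alpha_i}$

The result follows immediately.

Q.E.D.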
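To see (9), (10) and the bound of Lemma 6.1 in numbers, the sketch below builds the equalizing vector $P$ and evaluates the geometric tail bound. The function names `equalizing_prior` and `tail_bound` are illustrative assumptions, and the code assumes every $\alpha_i > 0$.

```python
from typing import List, Sequence


def equalizing_prior(c: Sequence[float], alpha: Sequence[float]) -> List[float]:
    """The vector P of (9): p_i proportional to c_i / alpha_i (assumes alpha_i > 0)."""
    total = sum(c_i / a_i for c_i, a_i in zip(c, alpha))
    return [(c_i / a_i) / total for c_i, a_i in zip(c, alpha)]


def tail_bound(c: Sequence[float], alpha: Sequence[float], n: int) -> float:
    """Bound of Lemma 6.1 on P(T > n): (1 - min(c) / sum(c_i / alpha_i)) ** n."""
    total = sum(c_i / a_i for c_i, a_i in zip(c, alpha))
    return (1.0 - min(c) / total) ** n


if __name__ == "__main__":
    c, alpha = [1.0, 2.0], [0.5, 1.0]
    P = equalizing_prior(c, alpha)
    print(P, [a * p / ci for a, p, ci in zip(alpha, P, c)])  # equal ratios, as in (9)
    print(tail_bound(c, alpha, 2))
```

For the equalizing vector every ratio $\alpha_i p_i / c_i$ equals $1 / \sum_i (c_i / \alpha_i)$, which is the value in (10).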


Now let $\delta^n = (\delta_1^n, \dots, \delta_{s_n}^n)$ be the strategy which, when in state $P$, stops if $f_n(P) = 0$ or else searches a box with maximal value of $\alpha_i p_i / c_i$, i.e.

$s_n = \min\{k : f_n(P^0_{\delta^n, k}) = 0\}$

Since $f_n(P) \downarrow f(P)$ it follows that

$s_n \uparrow s$ as $n \uparrow \infty$
Recalling that $D \equiv \max_i (R - c_i) = R - c$, we have

Theorem 6.2: $f(P, \delta^n) \le f(P) + D \Bigl(1 - \dfrac{c}{\sum_i c_i / \alpha_i}\Bigr)^n$ for all $P$, all $n$.


Proof: $f(P, \delta^n) - f(P) \le [\dots] \le D\, P_{\delta^*}(T > n)$

where the last inequality follows from (6). The result then follows from Lemma 6.1.

Q.E.D.


In order to effectively apply the policies $\delta^n$, $n \ge 1$, we need to be able to characterize the continuation sets $A_n \equiv \{P : f_n(P) < 0\}$. These sets can be constructed as follows:

$A_1 = \{P : \exists\, i : c_i - \alpha_i p_i R < 0\}$

$A_2 = A_1 \cup B_2$

where

(13)  $B_2 = \{P : \exists\, i, j : c_i - \alpha_i p_i R + (1 - \alpha_i p_i)[c_j - \alpha_j (T_i P)_j R] < 0\}$

Noting that $(T_i P)_j = (1 - \alpha_i \delta_{ij})\, p_j\, (1 - \alpha_i p_i)^{-1}$, where

$\delta_{ij} = \begin{cases} 1, & i = j \\ 0, & i \ne j \end{cases}$

we can write

(14)  $B_2 = \{P : \exists\, i, j : c_i - \alpha_i p_i R + c_j - \alpha_j p_j R - \alpha_i p_i c_j + \alpha_i \alpha_j \delta_{ij} p_j R < 0\}$

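Similarly

$A_3 = A_2 \cup B_3$

where

(15)  $B_3 = \{P : \exists\, i, j, k : c_i - \alpha_i p_i R + (1 - \alpha_i p_i)[c_j - \alpha_j (T_i P)_j R + (1 - \alpha_j (T_i P)_j)(c_k - \alpha_k (T_j T_i P)_k R)] < 0\}$

$\quad = \{P : \exists\, i, j, k : c_i - \alpha_i p_i R + c_j - \alpha_j p_j R + c_k - \alpha_k p_k R - \alpha_i p_i c_j - (\alpha_i p_i + \alpha_j p_j) c_k + \alpha_i \alpha_j \delta_{ij} p_j (R + c_k) + \alpha_k p_k R (\alpha_i \delta_{ik} + \alpha_j \delta_{jk}) - \alpha_i \alpha_j \alpha_k \delta_{ik} \delta_{jk} p_k R < 0\}$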

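Similarly the other $A_n = A_{n-1} \cup B_n$ may be obtained. Also we may let

(16)  $B'_1 = A_1$

      $B'_2 = \{P : \exists\, i \ne j : c_i - \alpha_i p_i R + c_j - \alpha_j p_j R - \alpha_i p_i c_j < 0\}$

      $B'_3 = \{P : \exists\, i, j, k \text{ distinct} : c_i - \alpha_i p_i R + c_j - \alpha_j p_j R + c_k - \alpha_k p_k R - \alpha_i p_i c_j - (\alpha_i p_i + \alpha_j p_j) c_k < 0\}$

Then $B'_n \subset B_n$ and we may approximate $A_n$ by $\bigcup_{i=1}^{n} B'_i$. We also note that $B'_1 = A_1$ and $B'_1 \cup B'_2 = A_2$.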
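Membership in the first two sets of (16) is easy to test directly from the closed-form inequalities. The sketch below assumes equal rewards $R$ (as in this section), 0-based box indices, and the hypothetical function names `in_A1` and `in_B2_prime`; the numbers in the usage lines are illustrative only.

```python
from itertools import permutations
from typing import Sequence


def in_A1(p: Sequence[float], c: Sequence[float], alpha: Sequence[float], R: float) -> bool:
    """P is in A_1 if some single search has negative risk: c_i - alpha_i p_i R < 0."""
    return any(c[i] - alpha[i] * p[i] * R < 0 for i in range(len(p)))


def in_B2_prime(p: Sequence[float], c: Sequence[float], alpha: Sequence[float],
                R: float) -> bool:
    """P is in B'_2 if searching some pair of distinct boxes i, then j, has negative risk."""
    return any(
        c[i] - alpha[i] * p[i] * R + c[j] - alpha[j] * p[j] * R - alpha[i] * p[i] * c[j] < 0
        for i, j in permutations(range(len(p)), 2)
    )


if __name__ == "__main__":
    c, alpha, R = [50.0, 50.0], [1.0, 0.65], 100.0
    for P in ([0.4, 0.6], [0.5, 0.5], [0.6, 0.4]):
        print(P, in_A1(P, c, alpha, R), in_B2_prime(P, c, alpha, R))
```

A state that fails both tests lies outside $B'_1 \cup B'_2$, so a rule based on these two sets alone would stop there.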